Fusion of multiple approximate nearest neighbor classifiers for fast and efficient classification
نویسندگان
چکیده
The nearest neighbor classifier (NNC) is a popular non-parametric classifier. It is a simple classifier with no design phase and shows good performance. Important factors affecting the efficiency and performance of NNC are (i) memory required to store the training set, (ii) classification time required to search the nearest neighbor of a given test pattern, and (iii) due to the curse of dimensionality it becomes severely biased when the dimensionality of the data is high with finite samples. In this paper we propose (i) a novel pattern synthesis technique to increase the density of patterns in the input feature space which can reduce the curse of dimensionality effect, (ii) a compact representation of the training set to reduce the memory requirement, (iii) a weak approximate nearest neighbor classifier which has constant classification time, and (iv) an ensemble of the approximate nearest neighbor classifiers where the individual classifier’s decisions are combined based on the majority vote. The ensemble has constant classification time upperbound and according to empirical results, it shows good classification accuracy. A comparison based on empirical results is shown between our approaches and other related classifiers. 2004 Elsevier B.V. All rights reserved.
منابع مشابه
Identification of selected monogeneans using image processing, artificial neural network and K-nearest neighbor
Abstract Over the last two decades, improvements in developing computational tools made significant contributions to the classification of biological specimens` images to their correspondence species. These days, identification of biological species is much easier for taxonomist and even non-taxonomists due to the development of automated computer techniques and systems. In this study, we d...
متن کاملFusion of Different Corneal Parameters to Improve the Diagnosis of Keratoconus
Purpose: To diagnose keratoconus from healthy eyes, as well as suspected keratoconus. Methods: Certain parameters were extracted from Casia, Corvis, and Pentacam HR devices for 3 groups of healthy, with keratoconus, and suspected keratoconus. This study was performed on 340 eyes with keratoconus, 310 normal eyes, and 350 suspected keratoconus. The processing method involved the fusion of featur...
متن کاملA Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...
متن کاملOn Competitiveness of Nearest-Neighbor-Based Music Classification: A Methodological Critique
The traditional role of nearest-neighbor classification in music classification research is that of a straw man opponent for the learning approach of the hour. Recent work in high-dimensional indexing has shown that approximate nearest-neighbor algorithms are extremely scalable, yielding results of reasonable quality from billions of highdimensional features. With such efficient large-scale cla...
متن کاملSoftware Cost Estimation by a New Hybrid Model of Particle Swarm Optimization and K-Nearest Neighbor Algorithms
A successful software should be finalized with determined and predetermined cost and time. Software is a production which its approximate cost is expert workforce and professionals. The most important and approximate software cost estimation (SCE) is related to the trained workforce. Creative nature of software projects and its abstract nature make extremely cost and time of projects difficult ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Information Fusion
دوره 5 شماره
صفحات -
تاریخ انتشار 2004